15 research outputs found

    The VINEYARD Approach: Versatile, Integrated, Accelerator-Based, Heterogeneous Data Centres.

    Emerging web applications such as cloud computing, Big Data and social networks have created the need for powerful data centres hosting hundreds of thousands of servers. Currently, data centres are based on general-purpose processors, which provide high flexibility but lack the energy efficiency of customized accelerators. VINEYARD aims to develop an integrated platform for energy-efficient data centres based on new servers with novel coarse-grain and fine-grain programmable hardware accelerators. It will also build a high-level programming framework that allows end-users to seamlessly utilize these accelerators in heterogeneous computing systems through typical data-centre programming frameworks (e.g. MapReduce, Storm, Spark). This programming framework will further allow the hardware accelerators to be swapped in and out of the heterogeneous infrastructure so as to offer high flexibility and energy efficiency. VINEYARD will foster the expansion of the soft-IP core industry, currently limited to embedded systems, into the data-centre market. VINEYARD plans to demonstrate the advantages of its approach in three real use cases: (a) a bio-informatics application for high-accuracy brain modeling, (b) two critical financial applications, and (c) a big-data analysis application.

    Proposal of an FPGA hardware architecture for SLAM using multi-cameras and applied to mobile robotics

    This work presents a hardware architecture for the Simultaneous Localization And Mapping (SLAM) problem applied to embedded robotic systems. The architecture, which is based on an FPGA (Field-Programmable Gate Array) and multiple cameras, is composed of highly specialized hardware modules for robot localization and feature-based map building in real time, using features extracted from images read directly from CMOS cameras at 30 frames per second. The system is completely embedded on an FPGA and its performance is at least one order of magnitude better than software implementations running on high-end PCs. This result is achieved by exploiting hardware parallelism along with pipeline processing and by applying several hardware-oriented optimizations to the algorithms. The main contributions of this work are the architectures for the Extended Kalman Filter (EKF) and for the feature detection system based on the SIFT (Scale Invariant Feature Transform) algorithm. The implementation complexity of this work can be considered high, as it involves a large number of arithmetic and trigonometric operations in floating-point and fixed-point formats, intensive image processing for feature extraction and stability checking, and the development of a real-time image acquisition system for four CMOS cameras. In addition, communication interfaces were created to integrate the software and hardware embedded on the FPGA and to control the mobile robot base and read its sensors. Finally, besides the implementation details and the results, this work also presents basic concepts of mapping and the state of the art in SLAM algorithms with monocular and stereo vision.

    A project of a module for acquisition and color image pre-processing based on reconfigurable computation and applied to mobile robots

    This work proposes a basic module for color image acquisition and pre-processing applied to mobile robots, implemented in reconfigurable hardware within the SoC (System-on-a-Chip) concept. The basic module is presented together with more specific image pre-processing functions, which are used as a basis for verifying the functionality implemented in this research. The main functions performed by the basic module are: assembling frames from the pixels delivered by the CMOS digital camera, controlling the camera's configuration parameters, and converting between color spaces. The more specific functions cover image segmentation, centering, reduction and classification. The reconfigurable device used in this research is the FPGA (Field-Programmable Gate Array), which allows the specific functions to be adapted to the needs of each application while always building on the proposed module. The system was applied to gesture recognition and achieved a 99.57% recognition rate while operating at 31.88 frames per second.

    A Floating-point Extended Kalman Filter Implementation for Autonomous Mobile Robots

    Localization and mapping are two of the most important capabilities for autonomous mobile robots and have received considerable attention from the scientific computing community over the last 10 years. One of the most efficient methods to address these problems is based on the Extended Kalman Filter (EKF). The EKF simultaneously estimates a model of the environment (map) and the position of the robot based on odometric and exteroceptive sensor information. As this algorithm demands a considerable amount of computation, it is usually executed on high-end PCs coupled to the robot. In this work we present an FPGA-based architecture for the EKF algorithm that is capable of processing two-dimensional maps containing up to 1.8k features in real time (14 Hz), a three-fold improvement over a Pentium M 1.6 GHz and a 13-fold improvement over an ARM920T 200 MHz. The proposed architecture also consumes only 1.3% of the Pentium's and 12.3% of the ARM's energy per feature. Funding: CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) [BEX2683/06-7]; EPSRC [EP/C549481/1] and [EP/C512596/1].
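
    To make the EKF predict/update cycle mentioned in the abstract concrete, here is a minimal scalar sketch. It is an illustration under simplifying assumptions, not the paper's FPGA architecture: the state is a single position along a track, the landmark offset `d`, the noise variances `q` and `r`, and the function names `ekf_step` and `run_demo` are all hypothetical.

```python
import math


def ekf_step(x, p, u, z, d, q, r):
    """One predict/update cycle of a scalar extended Kalman filter.

    x, p : prior position estimate along a track and its variance
    u    : odometry displacement since the last step
    z    : measured range to a landmark placed d units off the track at x = 0
    q, r : process and measurement noise variances (assumed values)
    """
    # Predict: the motion model x' = x + u is linear, so its Jacobian is 1.
    x_pred = x + u
    p_pred = p + q

    # Update: the range model h(x) = sqrt(x^2 + d^2) is nonlinear,
    # so it is linearised around the predicted state.
    h = math.hypot(x_pred, d)
    jac = x_pred / h                # dh/dx evaluated at x_pred
    s = jac * p_pred * jac + r      # innovation variance
    k = p_pred * jac / s            # Kalman gain
    x_new = x_pred + k * (z - h)
    p_new = (1.0 - k * jac) * p_pred
    return x_new, p_new


def run_demo(steps=5):
    """Track a robot moving 1.0 per step, starting from a wrong estimate."""
    true_x, d = 2.0, 1.0
    x, p = 2.5, 1.0                 # deliberately biased initial estimate
    for _ in range(steps):
        true_x += 1.0
        z = math.hypot(true_x, d)   # noise-free range, for a repeatable demo
        x, p = ekf_step(x, p, u=1.0, z=z, d=d, q=0.01, r=0.01)
    return x, p, true_x


if __name__ == "__main__":
    est, var, truth = run_demo()
    print(f"estimate={est:.4f} truth={truth:.4f} variance={var:.5f}")
```

    A full SLAM filter stacks the robot pose and every map feature into one state vector, so the scalar products above become matrix operations whose cost grows with the number of features. That growth is the computational load the abstract refers to.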

    Scaling Up Modulo Scheduling for High-Level Synthesis

    Exploiting Kant and Kimura's Matrix Inversion Algorithm on FPGA

    Matrix inversion for real-time applications can be a challenge for designers, since its computational complexity is typically cubic. Parallelism has been widely exploited to reduce this complexity; however, most traditional methods do not scale well with the matrix size, leading to communication bottlenecks. In this paper we exploit a decentralised parallel hardware architecture based on a strongly non-singular matrix inversion algorithm proposed by Kant and Kimura in 1978, a parallel-oriented method whose communication pattern is independent of the matrix size, which mitigates the scalability problem. The hardware architecture is implemented in two different approaches using fixed-point arithmetic: dedicated and shared. In the dedicated approach a matrix can be inverted in linear time, while the shared approach has, in the best case, quadratic complexity. Experimental results are demonstrated on a Stratix V GX FPGA. For instance, in the dedicated approach an 8x8 matrix is inverted in 1.27 µs, while in the shared approach a 64x64 matrix is inverted in 153.40 µs using 64 pipelined processing elements. Funding: FAPESP (São Paulo Research Foundation).
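
    As a point of reference for the "typically cubic" cost mentioned in the abstract, the sketch below inverts a matrix with textbook sequential Gauss-Jordan elimination and counts the elimination updates. It is a baseline for comparison only and is not the Kant and Kimura algorithm implemented in the paper; the names `invert` and `updates` are hypothetical.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting.

    Returns (inverse, updates), where updates counts the multiply-subtract
    operations performed while eliminating, to make the cubic growth visible.
    """
    n = len(a)
    # Work on an augmented copy [A | I] so the input is left untouched.
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    updates = 0
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot element becomes 1.
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        # Clear this column in every other row.
        for row in range(n):
            if row != col:
                factor = m[row][col]
                m[row] = [v - factor * w for v, w in zip(m[row], m[col])]
                updates += 2 * n    # one update per augmented-row entry
    return [row[n:] for row in m], updates


if __name__ == "__main__":
    inverse, updates = invert([[4, 7], [2, 6]])
    print(inverse, updates)
```

    In this sketch the update count is 2n²(n − 1), so doubling the matrix size multiplies the sequential work by roughly eight. The linear-time and quadratic-time figures quoted in the abstract refer to the parallel hardware approaches, where the work is spread across processing elements.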

    Practical Education Fostered by Research Projects in an Embedded Systems Course

    The very nature of universities makes them unique environments for research and teaching. Although both activities constantly borrow from each other, a deeper level of interaction is not always achieved, for several reasons. This paper presents a successful experience in conducting an undergraduate course on embedded systems based on strong interaction with related research activities previously conducted by the authors. Known for being everywhere, embedded systems are constantly expanding in both complexity and production volume. In addition, heterogeneous systems are becoming prevalent in modern applications, which poses an additional difficulty for students in this area. In this context, this paper presents experiences in teaching embedded systems using a project-based learning approach, with strong emphasis on mobile robotic applications previously developed by MSc and PhD students. As a result, it has been observed that undergraduate students have the opportunity to build a strong background and feel better prepared to face the challenges of their future professional activities.